0. Summary

Consider $n$ blocks of $k$ observations $(X_{1j}, \ldots, X_{kj})$, $j = 1, \ldots, n$. Suppose the $X_{ij}$ are independent and $P(X_{ij} \le x) = F(x - \eta_j - \theta_i)$, where $\eta_j$ is the nuisance location parameter of the $j$th block and $\theta_i$ is the location parameter corresponding to population $\pi_i$ ($1 \le j \le n$, $1 \le i \le k$). We are interested in selecting populations associated with large location parameter $\theta$. To this end compute $H_i = \sum_{j=1}^{n} R_{ij}$, where $R_{ij} = [\#\ \text{of}\ X_{i'j} \le X_{ij}\ (1 \le i' \le k)]$, and $\bar X_{i\cdot} = n^{-1} \sum_{j=1}^{n} X_{ij}$, and base the terminal statistical decision on $\bar X_{1\cdot}, \ldots, \bar X_{k\cdot}$ (means procedure $P_{MP}$) or $H_1, \ldots, H_k$ (vector rank procedure $P_V$). Fix $t$ ($1 \le t < k$) and consider the problem of selecting populations associated with the $t$ largest $\theta$'s based on $\bar X_{1\cdot}, \ldots, \bar X_{k\cdot}$ or $H_1, \ldots, H_k$.

In this paper we investigate large sample behavior (as well as some fixed sample behavior) of $P_V$. The asymptotic relative efficiency of $P_V$ with respect to $P_{MP}$ is also studied.


1.  Introduction 

Let $X_{ij}$ ($j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, k$) be independent random samples drawn from populations $\pi_1, \ldots, \pi_k$ with absolutely continuous distribution functions (df's) $F(x - \theta_i)$. Let $\theta_{[1]} \le \cdots \le \theta_{[k]}$ denote the ordered values of the unknown $\theta_i$, and let $\pi_{(i)}$ denote the population associated with $\theta_{[i]}$; these associations are assumed completely unknown. Often for some fixed $t$ ($1 \le t < k$) an experimenter is interested in the problem of selecting the "so-called" $t$ best populations $\pi_{(k-t+1)}, \ldots, \pi_{(k)}$. For selection of the $t$ best populations, Bechhofer (1954) proposed the means procedure (denoted by $P_{MP}$) which selects, as being the $t$ best populations, the $t$ populations yielding the $t$ highest sample means $\bar X_{i\cdot}$ ($= n_i^{-1} \sum_{j=1}^{n_i} X_{ij}$). Bechhofer requires that the probability that the so-selected $t$ populations are the $t$ best [when this occurs, a Correct Selection (CS) is said to occur] be at least $P^*$ (a prespecified constant between $\binom{k}{t}^{-1}$ and 1) whenever $\theta_{[k-t+1]} - \theta_{[k-t]} \ge \delta^*$ ($\delta^*$ is a prespecified positive constant). A different procedure was proposed by Gupta (1956, 1965): rather than selecting the $t$ populations associated with the $t$ highest sample means, he selects a subset of the $k$ populations (retaining in the selected subset all the populations yielding sample means close to the $t$ highest sample means) and requires that the probability be at least $P^*$ that the selected subset contains the $t$ best (when this occurs, a CS is said to occur). Both Bechhofer and Gupta considered the case of normal distributions with common known variances; for the case of normal distributions but with (possibly different) unknown variances the reader is referred to Dudewicz and Dalal (1976). The robustness of the means procedure is broached in Lehmann (1963) and is under investigation, in a more general context, by one of the authors [YJL].

Lehmann (1963) and Bartlett and Govindarajulu (1968) based selection procedures on the joint ranks of the observations in the combined sample of $N = \sum_i n_i$ observations. Specifically, each observation $X_{ij}$ is assigned a score $a_{ij} = E[Z(R_{ij}) \mid G]$, where $Z(1) < \cdots < Z(N)$ denotes an ordered sample from any continuous df $G$ and $R_{ij}$ denotes the rank of $X_{ij}$ in the combined sample. The selection procedures are then based on the quantities $n_i^{-1} \sum_j a_{ij}$ ($1 \le i \le k$). Lehmann's approach uses a Bechhofer-type (indifference-zone) approach while Bartlett and Govindarajulu use a Gupta-type (subset-selection) approach. Bartlett and Govindarajulu also base some selection procedures on randomized scores (i.e., quantities $n_i^{-1} \sum_j Z(R_{ij})$ ($1 \le i \le k$)); but we have shown (details will not be given here) that in selection procedures based on randomized scores the probability of CS (denoted by $P[CS]$) is bounded away from 1 for any configuration of parameters and two different statisticians reach, with positive probability, two different conclusions from the same set of observations. This extends results of Jogdeo (1966) to ranking and selection problems. An extensive review of other selection procedures (including joint rank procedures) is provided in Lee and Dudewicz (1974).

The model usually assumed in the literature is that of the one-way analysis-of-variance model. The selection procedure investigated in this paper arises from the two-way analysis-of-variance type model where a block effect enters: namely $P(X_{ij} \le x) = F(x - \eta_j - \theta_i)$, where $\eta_j$ is a nuisance location parameter of the $j$th block ($1 \le j \le n$). In this case ranks within each block are preferable to joint ranks. McDonald (1972, 1973) makes subset-selection approaches to a selection problem by basing terminal decision rules on ranks within each block, and Dudewicz and Fan (1973) suggested an indifference-zone approach. In Section 2 we investigate, by an indifference-zone approach, selection procedures based on ranks within each block (we denote this procedure by $P_V$) under the slippage parameter configuration (SPC) $\theta_{[1]} = \cdots = \theta_{[k-t]} < \theta_{[k-t+1]} = \cdots = \theta_{[k]}$; all the results are asymptotic. In Section 3, we investigate the asymptotic relative efficiency (ARE) of $P_V$ with respect to $P_{MP}$ as $\theta_{[k-t+1]} - \theta_{[k-t]}$ tends to zero under the SPC assumption. The configuration of $\theta_i$'s minimizing $P[CS \mid P_V]$ is investigated in Section 4; in particular we show that the SPC is not necessarily least-favorable. In Section 5 we discuss practicality of the assumption of the SPC as an underlying configuration. In this article we denote by $O(a)$ a positive quantity such that $a^{-1} O(a)$ converges to a positive constant in the limit of $a$.

2. $P[CS \mid P_V]$ under the SPC: Asymptotic Results

We make the following probability requirement for given $\delta^*$ and $P^*$ ($\delta^* > 0$, $\binom{k}{t}^{-1} < P^* < 1$):

(2.1) Probability requirement: We select the populations $\pi_{(k-t+1)}, \ldots, \pi_{(k)}$ (i.e., we make a CS) with probability $P[CS] \ge P^*$ whenever $\theta_{[k-t+1]} - \theta_{[k-t]} \ge \delta^*$.

Consider the following single-stage procedure: Take $n$ independent vectors $X_j = (X_{1j}, \ldots, X_{kj})$ ($1 \le j \le n$) ($X_{ij}$ denotes the $j$th observation from $\pi_i$); compute $H_i = \sum_{j=1}^{n} R_{ij}$ ($1 \le i \le k$), where $R_{ij} = \{\#\ \text{of}\ X_{i'j} \le X_{ij}\ (1 \le i' \le k)\}$; and select (as being the $t$ best populations) the populations associated with the $t$ highest $H_i$'s (breaking ties, if any, by randomization).
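The single-stage procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `within_block_ranks` and `vector_rank_select`, the 0-based population indices, and the seeded random generator used for tie-breaking are assumptions of the example, not part of the procedure's definition.

```python
import random

def within_block_ranks(block):
    """R_ij for one block: the number of observations in the block <= X_ij."""
    return [sum(1 for other in block if other <= x) for x in block]

def vector_rank_select(blocks, t, rng=None):
    """Return the rank sums H_i and the (0-based) indices of the t selected populations.

    blocks: list of n blocks, each a list of k observations (one per population).
    """
    rng = rng or random.Random(0)  # hypothetical default; any randomizer would do
    k = len(blocks[0])
    H = [0] * k
    for block in blocks:
        # Ranks are taken within the block, so block effects cancel.
        for i, r in enumerate(within_block_ranks(block)):
            H[i] += r
    # Order by rank sum; the random second key breaks ties by randomization.
    order = sorted(range(k), key=lambda i: (H[i], rng.random()), reverse=True)
    return H, sorted(order[:t])
```

For instance, with the three blocks `[[0.1, 0.5, 0.9], [1.2, 1.1, 1.7], [2.0, 2.4, 2.6]]` the rank sums are `[4, 5, 9]`, so the third population is selected for `t = 1` even though the block levels differ widely.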

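We first consider $t = 1$ and then generalize to $t > 1$. Let

$\Omega_0 = \{\theta \in R^k : \theta = (\theta_{[1]}, \ldots, \theta_{[k]})\}$,

$\Omega_0(\delta^*, t) = \{\theta \in \Omega_0 : \theta_{[k-t+1]} - \theta_{[k-t]} \ge \delta^*\}$,

$\omega_0(\delta^*, t) = \{\theta \in \Omega_0 : \theta_{[1]} = \cdots = \theta_{[k-t]} \le \theta_{[k-t+1]} - \delta^*,\ \theta_{[k-t+1]} = \cdots = \theta_{[k]},\ \delta^* > 0\}$,

$\mathcal{S}_0(t)$: $\theta_{[1]} = \cdots = \theta_{[k-t]} = \theta_{[k-t+1]} - \delta^*$, $\theta_{[k-t+1]} = \cdots = \theta_{[k]}$.

Lemma 2.1: For selection of $\pi_{(k)}$ under $\omega_0(\delta^*, 1)$, $P[CS \mid P_V]$ is a nondecreasing function of $\theta_{[k]}$. Hence $\inf_{\omega_0(\delta^*, 1)} P[CS \mid P_V] = P[CS \mid P_V, \mathcal{S}_0(1)]$.

Proof

See Theorem 3.1 of McDonald (1972).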


Now we wish to determine a sample size $n_0$ which will guarantee $P[CS \mid P_V, \theta \in \omega_0(\delta^*, 1)]$ to be at least $P^*$ for given $\delta^* > 0$, but we do not know how to determine the sample size for given $P^*$ and $\delta^*$. Rather, we find $\delta^*$ for given $P^*$ and sample size $n$; namely we put $\delta^*$ as a function of $n$ and $P^*$ and then solve for $n$ given $\delta^*$ and $P^*$. (This method was introduced by Lehmann (1963).) To this end we need to investigate the asymptotic determination of $P[CS \mid P_V]$ under the following configuration with $t = 1$:

(2.2) $\mathcal{S}_0(t, n)$: $\theta_{[1]} = \cdots = \theta_{[k-t]}$, $\theta_{[k-t+1]} - \theta_{[k-t]} = \delta(n)$, $\theta_{[k-t+1]} = \cdots = \theta_{[k]}$.

Let $H_{(i)}$ be the sum of rank scores yielded by $\pi_{(i)}$. To show the dependence of $H_{(i)}$ on $n$, we write $H_{(i)}(n)$, and for notational convenience, without loss of generality, we let

- — 


covered  after  Theorem  2.5),  and  assume 


(2.4) $\int f^2(x)\,dx < \infty$ ($f(x)$ = pdf of the underlying df $F$).

(For some pdf's $\int f^2(x)\,dx$ does not exist. Df's satisfying (2.4) are characterized by Lemma 1.4.1 of Kagan, Linnik and Rao (1973).)

Lemma 2.3: Under the configuration $\mathcal{S}_0(1, n)$, defining

$V_i(n) = n^{-1/2} (H_k(n) - H_i(n))$ ($1 \le i \le k-1$),

we find that

(2.5) $\lim_{n \to \infty} E[V_i(n)] = \delta k \int f^2(x)\,dx$ ($1 \le i \le k-1$),

(2.6) $\lim_{n \to \infty} \mathrm{Var}[V_i(n)] = k(k+1)/6$ ($1 \le i \le k-1$),

and

(2.7) $\lim_{n \to \infty} \mathrm{Cov}[V_i(n), V_{i'}(n)] = k(k+1)/12$ ($1 \le i \ne i' \le k-1$).
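The limiting constants in (2.6) and (2.7) can be checked by direct enumeration. As $\delta(n) \to 0$ the rank vector of a single block becomes uniformly distributed over the $k!$ permutations, and because blocks are independent the limiting variance and covariance of $V_i(n)$ equal those of the per-block differences $R_{kj} - R_{ij}$. The sketch below assumes exactly that equal-parameter limit; the function name `rank_difference_moments` is introduced only for illustration.

```python
from fractions import Fraction
from itertools import permutations

def rank_difference_moments(k):
    """Exact Var(R_k - R_1) and Cov(R_k - R_1, R_k - R_2) for a uniformly
    random permutation of the ranks 1..k (requires k >= 3)."""
    perms = list(permutations(range(1, k + 1)))
    n_perm = len(perms)
    d1 = [p[k - 1] - p[0] for p in perms]  # R_k - R_1 in each arrangement
    d2 = [p[k - 1] - p[1] for p in perms]  # R_k - R_2 in each arrangement
    mean1 = Fraction(sum(d1), n_perm)
    mean2 = Fraction(sum(d2), n_perm)
    var = Fraction(sum(x * x for x in d1), n_perm) - mean1 ** 2
    cov = Fraction(sum(x * y for x, y in zip(d1, d2)), n_perm) - mean1 * mean2
    return var, cov
```

For $k = 3$ the function returns variance 2 and covariance 1, in agreement with $k(k+1)/6$ and $k(k+1)/12$.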

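Proof

Defining

$P_r^{(i)} = P[X_{i1}$ has the $r$th rank among $X_{11}, \ldots, X_{k1} \mid \mathcal{S}_0(1, n)]$,

we have

(2.8) $E[V_i(n)] = n^{-1/2} \sum_{j=1}^{n} E(R_{kj} - R_{ij}) = n^{1/2} \sum_{r=1}^{k} r\,(P_r^{(k)} - P_r^{(i)})$.

Let $\theta_1 = \cdots = \theta_{k-1} = \theta$ and $\theta_k = \theta + \delta(n)$. Then

$P_r^{(k)} = \binom{k-1}{r-1} \int F^{r-1}(x - \theta)[1 - F(x - \theta)]^{k-r}\,dF(x - \theta - \delta(n))$
$\qquad = \binom{k-1}{r-1} \int F^{r-1}(x)[1 - F(x)]^{k-r}\,dF(x - \delta(n))$.

Note that we do not lose any generality by letting $\theta = 0$. Hence

$P_r^{(k)} = \binom{k-1}{r-1} \int F^{r-1}(x + \delta(n))[1 - F(x + \delta(n))]^{k-r}\,dF(x)$,

$P_r^{(i)} = \binom{k-2}{r-2} \int F^{r-2}(x)\,F(x - \delta(n))[1 - F(x)]^{k-r}\,dF(x)$
$\qquad + \binom{k-2}{r-1} \int F^{r-1}(x)[1 - F(x)]^{k-r-1}[1 - F(x - \delta(n))]\,dF(x)$,

and combining $P_r^{(k)}$ and $P_r^{(i)}$ and taking the limit yields

(2.9) $\lim_{n \to \infty} n^{1/2} (P_r^{(k)} - P_r^{(i)})$
$\qquad = \binom{k-1}{r-1} \delta \int \{(r-1) F^{r-2}(x)[1 - F(x)]^{k-r} - (k-r) F^{r-1}(x)[1 - F(x)]^{k-r-1}\} f(x)\,dF(x)$
$\qquad + \binom{k-2}{r-2} \delta \int F^{r-2}(x)[1 - F(x)]^{k-r} f^2(x)\,dx$
$\qquad - \binom{k-2}{r-1} \delta \int F^{r-1}(x)[1 - F(x)]^{k-r-1} f^2(x)\,dx$,

hence (2.5) is obtained from (2.8). Define

$P_{\ell, q}^{(i,j)} = P[X_{i1}$ has the $\ell$th rank, and $X_{j1}$ has the $q$th rank among $X_{11}, \ldots, X_{k1} \mid \mathcal{S}_0(1, n)]$.

Note that $P_{\ell, \ell}^{(i,j)} = 0$ ($i \ne j$, $1 \le \ell \le k$). Then

(2.10) $\lim_{n \to \infty} P_r^{(i)} = \dfrac{1}{k}$ and $\lim_{n \to \infty} P_{\ell, q}^{(i,j)} = \dfrac{1}{k(k-1)}$ ($\ell \ne q$).

(For details see Lee and Dudewicz (1974).) From (2.10), a computation shows (2.6) and (2.7). ■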


Thus we have obtained asymptotic moments of $V_i(n)$ ($1 \le i \le k-1$). To evaluate $P[CS \mid P_V, \mathcal{S}_0(1, n)]$, we need to obtain an asymptotic distribution of $V(n) = (V_1(n), \ldots, V_{k-1}(n))'$. We can show that any linear combination of $(V_1(n), \ldots, V_{k-1}(n))'$ has an asymptotic normal distribution by a Lindeberg-Feller type central limit theorem (specifically see §26 of Gnedenko and Kolmogorov (1949)). Thus $V(n)$ has an asymptotic $(k-1)$-variate normal distribution with certain known mean and variance-covariance. The following lemma is proven in Lee and Dudewicz (1974).


Lemma 2.4: The $(k-1)$-variate random vector $(V_1(n), \ldots, V_{k-1}(n))'$ has an asymptotic $(k-1)$-variate normal distribution with mean $\delta k \int f^2(x)\,dx$ and variance-covariance

$\sigma_{ii'} = k(k+1)(1 + \delta_{ii'})/12$,

where $\delta_{ii'} = 1$ if $i = i'$ and $0$ otherwise.


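Now we are prepared to approximate $P[CS \mid P_V, \mathcal{S}_0(1, n)]$. Let $(U_1, \ldots, U_{k-1})'$ be a $(k-1)$-variate normal random vector satisfying $E(U_i) = 0$, $\mathrm{Cov}(U_i, U_{i'}) = (1 + \delta_{ii'})/2$. Then

$P[CS \mid P_V, \mathcal{S}_0(1, n)]$
$\quad = P\{[k(k+1)/6]^{-1/2} [V_i(n) - \delta k \int f^2(x)\,dx] > -[k(k+1)/6]^{-1/2}\, \delta k \int f^2(x)\,dx,\ 1 \le i \le k-1 \mid \mathcal{S}_0(1, n)\}$
$\quad \approx P\{U_i > -[k(k+1)/6]^{-1/2}\, \delta k \int f^2(x)\,dx,\ 1 \le i \le k-1\}$.

Letting $A > 0$ satisfy

(2.11) $P(U_i > -A,\ 1 \le i \le k-1) = P^*$,

we find

Theorem 2.5: For $P[CS \mid P_V, \mathcal{S}_0(1, n)]$ to be asymptotically $P^*$ ($1/k < P^* < 1$), $\delta(n)$ should satisfy

$\lim_{n \to \infty} n^{1/2} \delta(n) = [k(k+1)/6]^{1/2} \left(k \int f^2(x)\,dx\right)^{-1} A$.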

We make several remarks on implications of Theorem 2.5:

(i) if $\lim_{n \to \infty} n^{1/2} \delta(n) = \infty$, then $\lim_{n \to \infty} P[CS \mid P_V, \mathcal{S}_0(1, n)] = 1$, and if $\lim_{n \to \infty} n^{1/2} \delta(n) = 0$, then $\lim_{n \to \infty} P[CS \mid P_V, \mathcal{S}_0(1, n)] = 1/k$; that is, if $\delta(n) \ne O(n^{-1/2})$, then $P[CS \mid P_V, \mathcal{S}_0(1, n)]$ converges either to 1 or to $1/k$, in which case we cannot relate $n$ and $\delta^*$ (as $\delta^* \to 0$) for fixed $P^*$ (for the cases of $P_{MP}$ and joint rank procedures, (i) is implicit in Lehmann (1963));

(ii) when $\delta(n) = O(n^{-1/2})$, we can relate $n$ and $\delta^*$ for given $P^*$ via ($n_A(P_V)$ = approximated $n$)

(2.12) $n_A(P_V) = (A/\delta^*)^2\,[k(k+1)/6]\,[k \int f^2(x)\,dx]^{-2}$;

(iii) consider the question how good $n_A(P_V)$ is; namely, letting $n_{TRUE}$ denote the sample size which will guarantee $P[CS \mid P_V, \mathcal{S}_0(1)] \ge P^*$, does $n_A(P_V)/n_{TRUE}$ converge to 1 as $\delta^* \to 0$?; the answer is conjectured to be affirmative (see Lee and Dudewicz (1974)); and

(iv) the conjecture of (iii) justifies, in part, dropping the randomization part of $P[CS \mid P_V, \mathcal{S}_0(1, n)]$ as we did earlier. [Such dropping has been done without justification in the literature, e.g., p. 270 of Lehmann (1963), p. 295 of Puri and Puri (1968), p. 623 of Puri and Puri (1969), p. 377 of Bhapkar and Gore (1971), and p. 258 of Alam and Thompson (1971) among others.]

The $P_{MP}$ version of (2.12) is due to Lehmann (1963). Namely, let $m$ be the sample size for $P[CS \mid P_{MP}, \mathcal{S}_0(1, m)] = P^*$ asymptotically. Then

(2.13) $m = 2 (A \sigma / \delta^*)^2$,

where $\sigma^2$ is the variance of the underlying df, $A$ satisfies (2.11), and $\delta^* = \theta_{[k]} - \theta_{[k-1]}$. [(2.13) is the equation (11) of Lehmann (1963).] Note that when the underlying df is normal, (2.13) is the sample size obtained by Bechhofer (1954), and is thus exact.
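As a numerical illustration of (2.11) to (2.13), the constant $A$ can be computed from the representation $U_i = (Z_i - Z_0)/\sqrt{2}$ with $Z_0, \ldots, Z_{k-1}$ independent standard normal, which gives $P(U_i > -A,\ 1 \le i \le k-1) = \int \Phi^{k-1}(x + \sqrt{2} A)\,d\Phi(x)$. The sketch below uses Simpson's rule and bisection; the function names, the integration range $[-8, 8]$, and the step count are assumptions chosen for the example, not prescriptions from the text.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_all_exceed(a, k, steps=2000, lim=8.0):
    """P(U_i > -a, 1 <= i <= k-1) for equicorrelated (1/2) unit normals,
    evaluated as the integral of Phi(x + sqrt(2) a)^(k-1) dPhi(x) by Simpson's rule."""
    h = 2.0 * lim / steps  # steps must be even
    total = 0.0
    for s in range(steps + 1):
        x = -lim + s * h
        w = 1 if s in (0, steps) else (4 if s % 2 else 2)
        phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += w * normal_cdf(x + a * math.sqrt(2.0)) ** (k - 1) * phi
    return total * h / 3.0

def solve_A(p_star, k, tol=1e-10):
    """Bisection for A in (2.11); requires 1/k < p_star < 1."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_all_exceed(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_size_rank(A, delta_star, k, int_f2):
    """Approximate n_A(P_V) of (2.12); int_f2 is the integral of f^2."""
    return (A / delta_star) ** 2 * (k * (k + 1) / 6.0) / (k * int_f2) ** 2

def sample_size_means(A, delta_star, sigma):
    """Sample size m of (2.13) for the means procedure."""
    return 2.0 * (A * sigma / delta_star) ** 2
```

For a standard normal df, $\int f^2(x)\,dx = 1/(2\sqrt{\pi})$; with $k = 2$ the ratio `sample_size_means(...) / sample_size_rank(...)` equals $2/\pi$ whatever the values of $A$ and $\delta^*$.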

The results in this section apply so far only to the selection of the best population under $\mathcal{S}_0(1)$, but can be extended to the $t$ best selection problem under configuration $\mathcal{S}_0(t)$. The proof is, of course, more complicated, so we will state the results corresponding to Lemma 2.4 and Theorem 2.5 and refer to Lee and Dudewicz (1974) for proofs. We have

(2.14) $P[CS \mid P_V, \mathcal{S}_0(t, n)]$
$\quad \approx P[H_\ell(n) - H_i(n) > 0,\ k-t+1 \le \ell \le k,\ 1 \le i \le k-t]$
$\quad = P[V_{\ell i}(n) > 0,\ k-t+1 \le \ell \le k,\ 1 \le i \le k-t]$,

where

$V_{\ell i}(n) = n^{-1/2} (H_\ell(n) - H_i(n))$ ($k-t+1 \le \ell \le k$, $1 \le i \le k-t$).

Lemma 2.6: $(V_{k-t+1,1}(n), \ldots, V_{k,k-t}(n))'$ is a $t(k-t)$-variate random vector the limiting distribution of which, under the configuration $\mathcal{S}_0(t, n)$, is the distribution of a $t(k-t)$-variate normal vector $(U_{\ell i};\ k-t+1 \le \ell \le k,\ 1 \le i \le k-t)'$ with $E(U_{\ell i}) = \delta k \int f^2(x)\,dx$, $\mathrm{Var}(U_{\ell i}) = k(k+1)/6$, $\mathrm{Corr}(U_{\ell i}, U_{\ell i'}) = \mathrm{Corr}(U_{\ell i}, U_{\ell' i}) = 1/2$, $\mathrm{Corr}(U_{\ell i}, U_{\ell' i'}) = 0$ ($k-t+1 \le \ell \ne \ell' \le k$, $1 \le i \ne i' \le k-t$), where $\delta = \lim_{n \to \infty} n^{1/2} \delta(n)$.
Let $(U_1, \ldots, U_{k-t}, W_{k-t+1}, \ldots, W_{k-1})'$ be a $(k-1)$-variate normal random vector with $E(U_i) = E(W_j) = 0$, $\mathrm{Var}(U_i) = \mathrm{Var}(W_j) = 1$, $\mathrm{Corr}(U_i, U_{i'}) = \mathrm{Corr}(W_j, W_{j'}) = 1/2$, $\mathrm{Corr}(U_i, W_j) = -1/2$ ($1 \le i \ne i' \le k-t$, $k-t+1 \le j \ne j' \le k$), and let $A_t > 0$ satisfy

(2.15) $P^* = t\,P[U_i > -A_t,\ W_j > 0,\ 1 \le i \le k-t,\ k-t+1 \le j \le k]$.

Theorem 2.7: For (2.14) to be $P^*$ ($1/\binom{k}{t} < P^* < 1$) asymptotically under $\mathcal{S}_0(t, n)$, $\delta(n)$ should satisfy

(2.16) $\delta = \lim_{n \to \infty} n^{1/2} \delta(n) = [k(k+1)/6]^{1/2} \left[k \int f^2(x)\,dx\right]^{-1} A_t$.

The implications of Theorem 2.7 are the same as those of Theorem 2.5. Therefore, through Theorem 2.7, we can relate $n$ to $\delta^*$ and $P^*$ by

(2.17) $n_A(P_V) = (A_t/\delta^*)^2\,(k(k+1)/6)\,[k \int f^2(x)\,dx]^{-2}$

(note the difference between $A$ and $A_t$ in (2.12) and (2.17)). The $P_{MP}$ equivalent of (2.16) is due to Puri and Puri (1969) and is

(2.18) $m = 2 (A_t \sigma / \delta^*)^2$,

where $m$ is the sample size for $P[CS \mid P_{MP}, \mathcal{S}_0(t, m)]$ to be $P^*$, $\sigma^2$ is the variance of the underlying df, and $A_t$ satisfies (2.15). (2.18) is the equation (4A.11) of Puri and Puri (1969).
In this section we have studied $P[CS \mid P_V]$ under the SPC; namely how to relate the necessary sample size to the minimum discrepancy worth detecting and the required $P[CS]$ for large sample size under the assumption that the underlying df is known. Similar results for joint rank procedures were obtained by Lehmann (1963) and Puri and Puri (1969).

3. ARE of $P_V$ under the SPC.

Suppose there are two different selection procedures $P_1$ and $P_2$ with the same probability requirement. We define an asymptotic relative efficiency (ARE) of $P_1$ with respect to $P_2$ by

(3.1) $\mathrm{ARE}(P_1, P_2) = \lim_{\delta^* \to 0} \left( \dfrac{\text{Sample size for } P_2}{\text{Sample size for } P_1} \right)$.

To determine the ARE this way, we should be able to determine a sample size for given $\delta^*$ and $P^*$. We noted that when we let $\delta^* = \delta(n)$ and $\lim_{n \to \infty} n^{1/2} \delta(n) = c$ (an appropriate constant), $P[CS]$ converges to $P^*$ as $n \to \infty$. In other words, by letting $n = n(\delta^*)$ and requiring $\lim_{\delta^* \to 0} [n(\delta^*)]^{1/2} \delta^*$ to converge to some constant, $P[CS]$ converges to $P^*$ as $\delta^* \to 0$. Thus letting $n_{P_i}(\delta^*)$ ($i = 1, 2$) (the sample size of selection procedure $P_i$ for a given $\delta^*$) satisfy $\lim_{\delta^* \to 0} [n_{P_i}(\delta^*)]^{1/2} \delta^* = c_i$ ($i = 1, 2$), we can determine the $\mathrm{ARE}(P_1, P_2)$ (as $\delta^* \to 0$). One may suspect that $\mathrm{ARE}(P_1, P_2)$ (as $\delta^* \to 0$) and $\mathrm{ARE}(P_1, P_2)$ (as $n_{P_i} \to \infty$) may be different. [Note that the latter quantity was used by Lehmann (1963) to compute the ARE of a joint rank procedure with respect to $P_{MP}$.] However, we can show their equivalence as follows. If for given $P^*$ and $n_{P_i}$ ($i = 1, 2$) $\delta(n_{P_i})$ is determined so that $\lim_{n_{P_i} \to \infty} n_{P_i}^{1/2}\, \delta(n_{P_i}) = c_i$ ($i = 1, 2$), then $P[CS \mid P_i] \to P^*$. But since $P_1$ and $P_2$ are required to satisfy the same probability requirement, we have $\delta(n_{P_1}) = \delta(n_{P_2})$, and also $\lim_{n_{P_1} \to \infty} n_{P_1}^{1/2}\, \delta(n_{P_1}) = c_1$ and $\lim_{n_{P_2} \to \infty} n_{P_2}^{1/2}\, \delta(n_{P_2}) = c_2$. Note that as $n_{P_2} \to \infty$, $\delta(n_{P_2}) = \delta(n_{P_1}) \to 0$ and thus $n_{P_1} \to \infty$. Therefore we have

(3.2) $\mathrm{ARE}_{\delta^* \to 0}(P_1, P_2) = \mathrm{ARE}_{\delta(n_{P_1}) = \delta(n_{P_2}) \to 0}(P_1, P_2) = \mathrm{ARE}_{n_{P_2} \to \infty}(P_1, P_2)$.

By combining (2.17) and (2.18), we can compute the $\mathrm{ARE}(P_V, P_{MP})$ (as $\delta^* \to 0$) under $\mathcal{S}_0(t)$:

(3.3) $\mathrm{ARE}_{\delta^* \to 0}(P_V, P_{MP}) = 12 k \sigma^2 \left[\int f^2(x)\,dx\right]^2 / (k+1)$.

This $\mathrm{ARE}(P_V, P_{MP})$ is tabulated in Table 3.1 for several df's.

Table 3.1

df            ARE               k=2       k=3       k=5       k=10      k=30
Rectangular   k/(k+1)           .66667    .75000    .83333    .90909    .96774
Normal        3k/[(k+1)π]       .63662    .71620    .79578    .86812    .92413
Logistic      kπ²/[9(k+1)]      .73108    .82247    .91385    .99693    1.06125
Laplace       3k/(2k+1)         1.20000   1.28591   1.36364   1.42857   1.47549
Lower bound*  .864k/(k+1)       .57600    .64800    .72000    .78545    .83629

* The lower bound for $12 \sigma^2 [\int f^2(x)\,dx]^2$ was obtained by Hodges and Lehmann (1956) for the location parameter case.
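The entries of Table 3.1 follow from (3.3) once $\sigma^2$ and $\int f^2(x)\,dx$ are known for the df in question. The following sketch evaluates (3.3) for three of the tabulated df's; the dictionary `DISTRIBUTIONS` and the function name are introduced here only for illustration, and the constants are the standard ones for the uniform(0,1), standard normal, and standard logistic densities.

```python
import math

def are_rank_vs_means(k, sigma2, int_f2):
    """ARE(P_V, P_MP) of (3.3): 12 k sigma^2 [int f^2]^2 / (k + 1)."""
    return 12.0 * k * sigma2 * int_f2 ** 2 / (k + 1)

# (variance, integral of f^2) for each df; values assumed for the standard forms.
DISTRIBUTIONS = {
    "rectangular": (1.0 / 12.0, 1.0),                        # uniform on (0, 1)
    "normal": (1.0, 1.0 / (2.0 * math.sqrt(math.pi))),       # N(0, 1)
    "logistic": (math.pi ** 2 / 3.0, 1.0 / 6.0),             # F(x) = 1/(1 + e^-x)
}

def are_row(name, ks=(2, 3, 5, 10, 30)):
    """One row of the table for the named df."""
    sigma2, int_f2 = DISTRIBUTIONS[name]
    return [are_rank_vs_means(k, sigma2, int_f2) for k in ks]
```

Calling `are_row("normal")` reproduces the Normal row to the printed precision. For the Laplace density $f(x) = \tfrac{1}{2} e^{-|x|}$ ($\sigma^2 = 2$, $\int f^2 = 1/4$) the same function gives $3k/[2(k+1)]$, which is 1.0 at $k = 2$ and so does not reproduce the tabulated Laplace row; that row is therefore left out of the sketch.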

Hodges and Lehmann (1962) aligned observations so that they are free of block effects, and applied joint-rank procedures to random block designs. Likewise we can align observations, apply joint-rank selection procedures, and thus obtain better efficiencies (in the order of $(k+1)/k$). But there are cases where alignments of block effects are not applicable, e.g. p. 485 of Hodges and Lehmann (1962).

In passing we can note that Lehmann's lemma (Lemma 1 of Lehmann (1963)), which leads to (2.13) (and hence to (2.18) as well), is only justified heuristically. We now give a proof. We need the following generalized Helly-Bray Lemma:


Lemma 3.1: (Generalized Helly-Bray Lemma). Let $Q_n \to Q$, a continuous df of a random variable, and let $\{g_n\}$, $g$, $h$ be continuous functions satisfying

(i) $|g_n(x)| \le h(x)$ for all $x$,

(ii) $g_n(x) \to g(x)$ uniformly on finite intervals, and

(iii) $\int h\,dQ_n \to \int h\,dQ$.

Then $\int g_n\,dQ_n \to \int g\,dQ$.

Proof

See Lemma 7.1.1 of Johnson and Roussas (1970). ■
Theorem 3.2: (Lemma 1 of Lehmann (1963).) Let $A$ satisfy (2.11), let $\bar X_{(i)}$ be a sample mean (based on the sample size $n$) yielded by $\pi_{(i)}$ (the population associated with $\theta_{[i]}$), and let $\sigma^2$ be the variance of the underlying df $F$ with a pdf $f$. Under the configuration $\mathcal{S}_0(1, n)$, if $\lim_{n \to \infty} n^{1/2} \delta(n) = 2^{1/2} A \sigma$, then we have

$\lim_{n \to \infty} P[\bar X_{(k)} > \bar X_{(i)},\ 1 \le i \le k-1 \mid \mathcal{S}_0(1, n)] = P^*$.


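Proof

Let $\lim_{n \to \infty} n^{1/2} \delta(n) = \delta$ ($> 0$). Assuming, without loss of generality, that $E(\bar X_{(i)}) = \theta_{[i]}$,

$\lim_{n \to \infty} P[\bar X_{(k)} > \bar X_{(i)},\ 1 \le i \le k-1 \mid \mathcal{S}_0(1, n)]$
$\quad = \lim_{n \to \infty} P\left[ \dfrac{\bar X_{(k)} - \theta_{[k]}}{\sigma/\sqrt{n}} > \dfrac{\bar X_{(i)} - \theta_{[i]}}{\sigma/\sqrt{n}} - \dfrac{\theta_{[k]} - \theta_{[i]}}{\sigma/\sqrt{n}},\ 1 \le i \le k-1 \right]$.

Let $Y_i(n) = n^{1/2} (\bar X_{(i)} - \theta_{[i]})/\sigma$ ($1 \le i \le k$) and let $Y_i(n)$ be distributed as $F_n(\cdot)$. Then since the second moment of the underlying df exists, $F_n(y)$ converges to $\Phi(y)$ uniformly for all $y$ as $n \to \infty$, where $\Phi(y) = \int_{-\infty}^{y} (2\pi)^{-1/2} \exp[-x^2/2]\,dx$. Thus for every given $\epsilon > 0$ there exists an integer $n_1(\epsilon)$ such that, whenever $n \ge n_1(\epsilon)$,

$|F_n(y + \delta/\sigma) - \Phi(y + \delta/\sigma)| < \epsilon/2$.

Now by the continuity assumption there exists an integer $n_2(\epsilon)$ such that, whenever $n \ge n_2(\epsilon)$,

$\max \left\{ \int_{y \in (\delta/\sigma,\ n^{1/2}\delta(n)/\sigma)} dF_n(y),\ \int_{y \in (n^{1/2}\delta(n)/\sigma,\ \delta/\sigma)} dF_n(y) \right\} < \epsilon/2$.

Hence, for all $n \ge \max(n_1(\epsilon), n_2(\epsilon))$,

$|F_n(y + n^{1/2}\delta(n)/\sigma) - \Phi(y + \delta/\sigma)| < \epsilon$.

Therefore we have

$\prod_{i=1}^{k-1} F_n(x + n^{1/2}\delta(n)/\sigma) \to \Phi^{k-1}(x + \delta/\sigma)$,

and hence

$\lim_{n \to \infty} P[\bar X_{(k)} > \bar X_{(i)},\ 1 \le i \le k-1 \mid \mathcal{S}_0(1, n)]$
$\quad = \lim_{n \to \infty} P[Y_k(n) > Y_i(n) - n^{1/2}\delta(n)/\sigma,\ 1 \le i \le k-1]$
$\quad = \lim_{n \to \infty} \int \left[ \prod_{i=1}^{k-1} F_n(y + n^{1/2}\delta(n)/\sigma) \right] dF_n(y)$
$\quad = \int \Phi^{k-1}(x + \delta/\sigma)\,d\Phi(x)$.

The last equality is due to the generalized Helly-Bray Lemma. Thus letting $\delta = 2^{1/2} A \sigma$, where $A$ satisfies (2.11), the Theorem follows. ■

Note that if $\delta = 0$ or $\infty$, then $P[\bar X_{(k)} > \bar X_{(i)},\ 1 \le i \le k-1 \mid \mathcal{S}_0(1, n)]$ converges to $1/k$ or 1 respectively.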
4. LFC and Counterexamples.

The configuration of $\theta_i$'s which minimizes $P[CS]$ for any given selection procedure is called the least-favorable configuration (LFC). The SPC, $\mathcal{S}_0(t)$, is often least-favorable for selection procedures in the indifference-zone approach, and the equal-parameter configuration (EPC) $\theta_{[1]} = \cdots = \theta_{[k]}$ is often least-favorable in subset-selection approaches. Rizvi and Woodworth (1970) showed that $\inf_{\Omega_0(\delta^*, t)} P[CS] < P[CS \mid \mathcal{S}_0(t)]$ ($\inf_{\Omega_0} P[CS] < P[CS \mid \mathrm{EPC}]$) for selection procedures based on joint ranks in the indifference-zone approach (the subset-selection approach) for some df's. And McDonald (1972) also showed that $\inf P[CS] < P[CS \mid \mathrm{EPC}]$ for one of his subset-selection procedures based on vector ranks. In this section the counterexamples of Rizvi and Woodworth (1970) are modified to show that the SPC, $\mathcal{S}_0(t)$, is not the LFC for $P_V$. We consider two counterexamples: first, for the case of fixed $\delta^*$ and finite $n$; and second, for the case of $\delta^* \to 0$ (and thus $n \to \infty$).

Counterexample 4.1: Let $k = 3$, $t = 1$ and let $F$ be a continuous df which places probabilities of $q$ and $p$ ($= 1 - q$) uniformly on the intervals $(0, \epsilon)$ and $(1, 1 + \epsilon)$ respectively, where $\epsilon$ ($< 1/3$) is a positive constant. Let $\delta^* = \epsilon$, $0 \le \delta_2 \le \delta^*$, and $\mathcal{S}(\delta_2) = (\theta_1, \theta_2, \theta_3) = (0, \delta_2, \delta_2 + \delta^*)$, where $\theta_i$ is the location parameter for $\pi_i$ ($i = 1, 2, 3$). Then for $n = 1$, $P[CS \mid P_V, \mathcal{S}(\delta_2)]$ is a constant for any $\delta_2$, and for $n = 2$

$\max_{\delta_2} P[CS \mid P_V, \mathcal{S}(\delta_2)] = P[CS \mid P_V, \mathcal{S}(0)]$ and $\min_{\delta_2} P[CS \mid P_V, \mathcal{S}(\delta_2)] = P[CS \mid P_V, \mathcal{S}(\delta^*)]$.

This constitutes a counterexample because $\mathcal{S}(0) = (0, 0, \delta^*)$ is a SPC while $\mathcal{S}(\delta^*) = (0, \delta^*, 2\delta^*)$ is not.

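Proof

The supports of the distribution of the populations under the parameter configuration $\mathcal{S}(\delta_2)$ can be depicted as in Figure 4.1, where "heights" show the supports of df's under $\mathcal{S}(\delta_2)$.

Figure 4.1 Supports of df's under $\mathcal{S}(\delta_2)$ [figure not reproduced; axis marks $\delta_2$, $\delta^*$, $2\delta^*$, $3\delta^*$ for the lower intervals and $1$, $1+\delta^*$, $1+2\delta^*$, $1+3\delta^*$ for the upper intervals]

Note that $\pi_3$ (the best population) is separated from $\pi_1$ and $\pi_2$ in its support while $\pi_1$ and $\pi_2$ do not have disjoint supports.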

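Fix $n = 2$. Let $B_i$ be 0, 1, or 2 according as 0, 1, or 2 observations from $\pi_i$ are in the upper interval of the support of its distribution, let $B = (B_1, B_2, B_3)$, and let $b = (b_1, b_2, b_3)$ be a realization of $B$. Clearly

$P[B = b \mid (\theta_1, \theta_2, \theta_3)] = \prod_{i=1}^{3} \binom{2}{b_i} p^{b_i} q^{2 - b_i}$.

Let

$R = \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \end{pmatrix}$

be the matrix of ranks each row of which is a row vector of ranks $R_{ji} = 1, 2, 3$, and let

$r = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \end{pmatrix}$

be a realization of $R$. Given $R = r$, a CS (selection of $\pi_3$) occurs with probability 1 if $r_{13} + r_{23} > \max(r_{12} + r_{22},\ r_{11} + r_{21})$, with probability 1/2 if either $r_{13} + r_{23} = r_{12} + r_{22} > r_{11} + r_{21}$ or $r_{13} + r_{23} = r_{11} + r_{21} > r_{12} + r_{22}$, with probability 1/3 if $r_{13} + r_{23} = r_{11} + r_{21} = r_{12} + r_{22}$, and with probability 0 otherwise. The conditional probability that $R = r$ given $B = b$ under $\mathcal{S}(\delta_2)$ involves 27 possible rank combinations. Many of the possible rank combinations are not equally likely (hence our situation differs from those of Rizvi and Woodworth (1970) and McDonald (1972), where the rank combinations are equally likely). One example of the computations is that of $P[CS \mid P_V, \mathcal{S}(\delta_2), B = b]$ for $b = (0, 1, 1)$. $b = (0, 1, 1)$ means that for $\pi_1$ both observations are from the lower support, and for $\pi_2$ and $\pi_3$ one of two observations is from the upper support; this can be expressed as:

(i) $\begin{Bmatrix} [0, \delta^*], & [\delta_2, \delta_2 + \delta^*], & [\delta_2 + \delta^*, \delta_2 + 2\delta^*] \\ [0, \delta^*], & [1 + \delta_2, 1 + \delta_2 + \delta^*], & [1 + \delta_2 + \delta^*, 1 + \delta_2 + 2\delta^*] \end{Bmatrix}$

and

(ii) $\begin{Bmatrix} [0, \delta^*], & [1 + \delta_2, 1 + \delta_2 + \delta^*], & [\delta_2 + \delta^*, \delta_2 + 2\delta^*] \\ [0, \delta^*], & [\delta_2, \delta_2 + \delta^*], & [1 + \delta_2 + \delta^*, 1 + \delta_2 + 2\delta^*] \end{Bmatrix}$

express supports from which the observations originate to have $b = (0, 1, 1)$. Given (i) there are two possible rank combinations, while given (ii) there are two other possibilities. We now compute the probability of each rank combination. Let $\delta_1 = \delta^* - \delta_2$. Then

$P\begin{pmatrix} 1, 2, 3 \\ 1, 2, 3 \end{pmatrix} = \{ P[0 < X_1 < \delta_2,\ \delta_2 < X_2 < \delta_2 + \delta^*,\ \delta_2 + \delta^* < X_3 < \delta_2 + 2\delta^*]$
$\qquad + P[\delta_2 < X_1 < \delta^*,\ \delta_2 < X_1 < X_2 < \delta^*,\ \delta_2 + \delta^* < X_3 < \delta_2 + 2\delta^*]$
$\qquad + P[\delta_2 < X_1 < \delta^*,\ \delta^* < X_2 < \delta_2 + \delta^*,\ \delta_2 + \delta^* < X_3 < \delta_2 + 2\delta^*] \}$
$\qquad \times P[0 < X_1 < \delta^*,\ 1 + \delta_2 < X_2 < 1 + \delta_2 + \delta^*,\ 1 + \delta_2 + \delta^* < X_3 < 1 + \delta_2 + 2\delta^*]$
$\qquad = q^4 p^2 (1 - \delta_1^2 / 2\delta^{*2})$.

Similarly

$P\begin{pmatrix} 2, 1, 3 \\ 1, 2, 3 \end{pmatrix} = q^4 p^2\, \delta_1^2 / 2\delta^{*2}$, $P\begin{pmatrix} 1, 3, 2 \\ 1, 2, 3 \end{pmatrix} = q^4 p^2 (1 - \delta_1^2 / 2\delta^{*2})$, and $P\begin{pmatrix} 1, 3, 2 \\ 2, 1, 3 \end{pmatrix} = q^4 p^2\, \delta_1^2 / 2\delta^{*2}$.

Thus

$P[CS \mid P_V, \mathcal{S}(\delta_2), b = (0, 1, 1)] = 3/4 + \delta_1^2 / (8\delta^{*2})$.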
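The computation for $b = (0, 1, 1)$ can be retraced mechanically. The sketch below encodes the tie-breaking rule for a $2 \times 3$ rank matrix and the four rank combinations listed above, each weighted by its conditional probability given $b = (0, 1, 1)$; the argument `x` stands for $\delta_1^2/\delta^{*2}$, and the function names are assumptions made for this illustration.

```python
from fractions import Fraction

def cs_probability(r):
    """P[CS | R = r] for k = 3, t = 1, n = 2: population 3 must attain the
    largest rank sum, and ties are broken by randomization."""
    sums = [r[0][i] + r[1][i] for i in range(3)]
    best = max(sums)
    if sums[2] < best:
        return Fraction(0)
    return Fraction(1, sums.count(best))  # 1, 1/2 or 1/3 depending on ties

def cs_given_b011(x):
    """P[CS | b = (0,1,1)] as a function of x = delta_1^2 / delta*^2."""
    a = Fraction(x) / 2  # P(X_1 > X_2) when both lie in their lower intervals
    # Cases (i) and (ii) are equally likely given b; within each, the order
    # of X_1 and X_2 in the all-lower comparison decides the rank pattern.
    combos = [
        (((1, 2, 3), (1, 2, 3)), (1 - a) / 2),
        (((2, 1, 3), (1, 2, 3)), a / 2),
        (((1, 3, 2), (1, 2, 3)), (1 - a) / 2),
        (((1, 3, 2), (2, 1, 3)), a / 2),
    ]
    return sum(w * cs_probability(r) for r, w in combos)
```

With `x = Fraction(1, 4)` (that is, $\delta_2 = \delta^*/2$) the function returns $25/32 = 3/4 + 1/32$, matching the displayed formula.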

For the other cases the method of computation is similar. In all but 9 cases, $P[CS \mid P_V, \mathcal{S}(\delta_2), b]$ equals $P[CS \mid P_V, \mathcal{S}(\delta^*), b]$; those 9 cases are listed in Table 4.1. Now

$P[CS \mid P_V, \mathcal{S}(\delta_2)] - P[CS \mid P_V, \mathcal{S}(\delta^*)]$
$\quad = [1 - \delta_1^2/(2\delta^{*2})]\,\delta_1^2/(2\delta^{*2})\,[(4/3) q^4 p^2 + (4/3) q^3 p^3 + (4/3) q p^5] \ge 0$,

and the difference is monotone increasing in $\delta_1$ for $0 \le \delta_1 \le \delta^*$ (namely monotone decreasing in $\delta_2$ for $0 \le \delta_2 \le \delta^*$). Thus we conclude that $P[CS \mid P_V, \mathcal{S}(\delta_2)]$ is maximized at $\mathcal{S}(0)$ (which is a SPC) and is minimized at $\mathcal{S}(\delta^*)$. The case of $n = 1$ is trivial and the Counterexample follows.


Table 4.1

b          P[B=b]       P[CS | P_V, B=b, S(δ_2)]
(0,1,0)    2q^5 p       1/2 + δ_1^2/(4δ^2)
(0,1,1)    4q^4 p^2     3/4 + δ_1^2/(8δ^2)
(1,0,0)    2q^5 p       1 - δ_1^2/(4δ^2)
(1,0,1)    4q^4 p^2     1 - δ_1^2/(8δ^2)
(1,1,0)    4q^4 p^2     1/6 + (1/3)[1 - δ_1^2/(2δ^2)][δ_1^2/(2δ^2)]
(1,1,1)    8q^3 p^3     3/4 + (1/6)[1 - δ_1^2/(2δ^2)][δ_1^2/(2δ^2)]
(1,2,1)    4q^2 p^4     1/4 + (5/24)(δ_1^2/δ^2)
(2,1,1)    4q^2 p^4     2/3 - (5/24)(δ_1^2/δ^2)
(2,2,1)    2q p^5       (2/3)[1 - δ_1^2/(2δ^2)][δ_1^2/(2δ^2)]

Note that $\delta = \delta^*$ and $\delta_1 = \delta^* - \delta_2$.
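As a consistency check, the nine rows of Table 4.1 can be combined with the weights $P[B = b]$ and compared with the closed-form difference $[1 - \delta_1^2/(2\delta^{*2})]\,\delta_1^2/(2\delta^{*2})\,(4/3)(q^4 p^2 + q^3 p^3 + q p^5)$. The sketch below does this in exact rational arithmetic; `x` again denotes $\delta_1^2/\delta^{*2}$, so `x = 0` corresponds to $\mathcal{S}(\delta^*)$, and all function names are illustrative assumptions.

```python
from fractions import Fraction as Fr
from math import comb

def prob_b(b, p):
    """P[B = b] = prod_i C(2, b_i) p^b_i q^(2 - b_i) for n = 2 blocks."""
    q = 1 - p
    out = Fr(1)
    for bi in b:
        out *= comb(2, bi) * p ** bi * q ** (2 - bi)
    return out

def cs_table(x):
    """Conditional P[CS | B = b] for the nine rows, with x = delta_1^2/delta^2."""
    h = (1 - x / 2) * (x / 2)
    return {
        (0, 1, 0): Fr(1, 2) + x / 4,
        (0, 1, 1): Fr(3, 4) + x / 8,
        (1, 0, 0): 1 - x / 4,
        (1, 0, 1): 1 - x / 8,
        (1, 1, 0): Fr(1, 6) + h / 3,
        (1, 1, 1): Fr(3, 4) + h / 6,
        (1, 2, 1): Fr(1, 4) + Fr(5, 24) * x,
        (2, 1, 1): Fr(2, 3) - Fr(5, 24) * x,
        (2, 2, 1): Fr(2, 3) * h,
    }

def cs_difference(x, p):
    """P[CS | S(delta_2)] - P[CS | S(delta*)], summed over the nine rows."""
    now, base = cs_table(x), cs_table(Fr(0))
    return sum(prob_b(b, p) * (now[b] - base[b]) for b in now)

def closed_form(x, p):
    q = 1 - p
    return (1 - x / 2) * (x / 2) * Fr(4, 3) * (q**4 * p**2 + q**3 * p**3 + q * p**5)
```

The rows $(0,1,0)$ and $(1,0,0)$, $(0,1,1)$ and $(1,0,1)$, and $(1,2,1)$ and $(2,1,1)$ cancel in pairs, which is why only three terms survive in the closed form.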

We now show that the SPC is not the LFC for $P_V$ even for large samples. One method of showing this is to show that the ratio of $P[CS \mid P_V]$ under a configuration different from the SPC to that under the SPC converges to some number smaller than 1 for fixed $\delta^*$ as $n \to \infty$. Another method of constructing a counterexample is to show that the ratio of sample size for a configuration different from the SPC to that for the SPC converges to some number smaller than 1 for fixed $\delta^*$ as $P^* \to 1$. However, we have obtained a counterexample by another method originated by Rizvi and Woodworth (1970): we show that when the relation between $n$ and $\delta(n)$ ($= \theta_{[k-t+1]} - \theta_{[k-t]}$) satisfies (2.16), $P[CS \mid P_V]$ converges (as $n \to \infty$) to some number smaller than $P^*$ under a certain configuration of $\theta_i$'s different from the SPC, $\mathcal{S}_0(t)$, but still in $\Omega_0(\delta^*, t)$. This serves our purpose, because when the relation (2.16) holds between $\delta(n)$ and $n$, $P[CS \mid P_V]$ converges, as $n \to \infty$, to $P^*$ under the SPC. [One may ask how much larger $P[CS \mid P_V]$ is under the SPC than under the configuration we will consider; this question is discussed in the next section.]

Consider the selection of the t best populations, when
the underlying df is a logistic distribution with a location
parameter. For simplicity take $k \ge 4$ (k even) and t = k/2.
Without loss of generality drop [ ] around the ordered parameter
values for convenience of notation; namely take
$\theta_{[i]} = \theta_i$, $\pi_{(i)} = \pi_i$, and thus $H_{(i)}(n) = H_i(n)$ ($1 \le i \le k$).

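Lemma 4.2: Let $F(x) = (1 + e^{-x})^{-1}$ and let

$\theta_1(t,n)$: $\theta_1 = \cdots = \theta_{k-t-1} = -\theta_0$, $\theta_{k-t} = 0$, $\theta_{k-t+1} = \delta(n)$,
$\theta_{k-t+2} = \cdots = \theta_k = \theta_0$,

where $\delta(n) > 0$ is of the order $O(n^{-1/2})$, $\theta_0 > 0$ is
fixed and satisfies $\theta_0 > \delta(n)$, and k = 2t. Then

(4.1)  $\lim_{n\to\infty} P[CS \mid P_V, \theta_1(t,n)] \le \Phi[A_t\,\rho\,((k+1)/k)^{1/2}]$,

where $A_t$ satisfies (2.15),

(4.2)  $\rho = \sqrt{3}\,\{\int H_0(2F-1)\,dF\} / \{\int H_0^2\,dF - (\int H_0\,dF)^2\}^{1/2}$,

(4.3)  $H_0(x) = (k-t-1)F(x-\theta_0) + 2F(x) + (t-1)F(x+\theta_0)$,

and

$\lim_{n\to\infty} n^{1/2}\delta(n) = [k(k+1)/6]^{1/2} A_t\,[k \int f^2(x)\,dx]^{-1}$.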

Proof

For large samples, dropping the randomization part, we have

(4.4)  $P[CS \mid P_V, \theta_1(t,n)] = P[\max_{1 \le i \le k-t} H_i(n) < \min_{k-t < j \le k} H_j(n) \mid \theta_1(t,n)]$
       $\le P[V(n) > 0 \mid \theta_1(t,n)]$,

where $V(n) = n^{-1/2}(H_{k-t+1}(n) - H_{k-t}(n))$.

We will find an upper bound for (4.4) as $n \to \infty$ by finding
$\lim_{n\to\infty} E[V(n)]$ and $\lim_{n\to\infty} Var[V(n)]$, and applying a
Lindeberg-Feller type central limit theorem. The computations for
E[V(n)] and Var[V(n)] are lengthy, and thus are omitted. In
the limiting process, using Olshen's lemma (Lemma (12) of
Olshen (1967)), we have

$\lim_{n\to\infty} E[V(n) \mid \theta_1(t,n)] = [6(k+1)/k]^{1/2} A_t \int H_0(x)[2F(x) - 1]\,dF(x)$,

where $H_0(x)$ is given by (4.3), and, for $0 < \theta_0 < C(k,t,F)$ and $t \ge 2$,

(4.5)  $\lim_{n\to\infty} Var[V(n) \mid \theta_1(t,n)] \ge 2[\int H_1^2\,dF - (\int H_1\,dF)^2]$,

where $H_1(x) = (k-t-1)F(x+\theta_0) + 2F(x) + (t-1)F(x-\theta_0)$.
Since we assume k = 2t, we have $H_1 \equiv H_0$. Thus from (4.4)

$\lim_{n\to\infty} P[CS \mid P_V, \theta_1(t,n)] \le \lim_{n\to\infty} P[V(n) > 0 \mid \theta_1(t,n)]$

$= \lim_{n\to\infty} P\{ (V(n) - \lim E[V(n)]) / (\lim Var[V(n)])^{1/2} > -\lim E[V(n)] / (\lim Var[V(n)])^{1/2} \}$

$\le \Phi[A_t\,\rho\,((k+1)/k)^{1/2}]$,

where the last inequality is due to the asymptotic normality
of $\{V(n) - \lim E[V(n)]\} / \{\lim Var[V(n)]\}^{1/2}$, which follows from a
Lindeberg-Feller type central limit theorem, and to (4.5). ■
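
The constants in the bound can be checked by hand. The worked step below is a sketch rather than part of the original proof: it reads the sample-size relation of Lemma 4.2 as $\lim n^{1/2}\delta(n) = [k(k+1)/6]^{1/2} A_t [k\int f^2(x)\,dx]^{-1}$, uses the logistic identity $f = F(1-F)$, and uses the fact that $2F(X) - 1$ has variance 1/3 when X has df F.

```latex
% Logistic density: f = F(1 - F), so with u = F(x)
\int f^2(x)\,dx = \int F(1-F)\,dF = \int_0^1 u(1-u)\,du = \tfrac{1}{6}.

% Hence the normalizing constant of the limiting mean is
\lim_{n\to\infty} n^{1/2}\delta(n)
  = \Big[\tfrac{k(k+1)}{6}\Big]^{1/2} A_t \Big(\tfrac{k}{6}\Big)^{-1}
  = \Big[\tfrac{6(k+1)}{k}\Big]^{1/2} A_t .

% Dividing the limiting mean by the lower bound (4.5) on the standard
% deviation, with H_1 \equiv H_0 and a nonnegative limiting mean:
\frac{\lim E[V(n)]}{(\lim \mathrm{Var}[V(n)])^{1/2}}
  \le \frac{[6(k+1)/k]^{1/2} A_t \int H_0(2F-1)\,dF}
           {\{2[\int H_0^2\,dF - (\int H_0\,dF)^2]\}^{1/2}}
  = A_t \Big(\tfrac{k+1}{k}\Big)^{1/2}
    \frac{\sqrt{3}\int H_0(2F-1)\,dF}{\{\int H_0^2\,dF - (\int H_0\,dF)^2\}^{1/2}} .
```

The last factor is the correlation between $H_0(X)$ and $2F(X) - 1$, which is how the quantity $\rho$ enters the bound (4.1).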

Lemma 4.3: For any k and t, $1 \le t < k$,

$\lim_{P^* \to 1} \Phi^{-1}(P^*)/A_t = 1$,

where $A_t$ satisfies (2.15).

Proof

This follows from Lemma 2 of Rizvi and Woodworth (1970) upon
noting that $A_t$, which satisfies (2.15), also satisfies

$P[\max_{1 \le i \le k-t} Z_i < \min_{k-t < j \le k} Z_j + \sqrt{2}\,A_t] = P^*$,

where $Z_i$ ($1 \le i \le k$) are independent standard normal random
variables. [For the case t = 1, Dudewicz (1969) also
obtained the result of Lemma 4.3 in a different form.]
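
Lemma 4.3 can be illustrated numerically. The sketch below is not from the original report: it assumes the normal-theory characterization of $A_t$ quoted in the proof, evaluates the probability by a one-dimensional trapezoid rule (conditioning on the maximum of the first k - t normals), and solves for $A_t$ by bisection. The function names `prob_separated`, `solve_a` and `ratio` are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def prob_separated(a, k, t, step=0.02, lim=8.0):
    """P[max_{i<=k-t} Z_i < min_{j>k-t} Z_j + sqrt(2)*a] by the trapezoid rule."""
    c = math.sqrt(2.0) * a
    m = k - t
    n_steps = int(round(2.0 * lim / step))
    total = 0.0
    for s in range(n_steps + 1):
        z = -lim + s * step
        # density at z of the maximum of m independent standard normals
        dens_max = m * STD.cdf(z) ** (m - 1) * STD.pdf(z)
        # probability that the remaining t normals all exceed z - c
        all_above = (1.0 - STD.cdf(z - c)) ** t
        weight = 0.5 if s in (0, n_steps) else 1.0
        total += weight * dens_max * all_above
    return total * step


def solve_a(p_star, k, t):
    """Bisection for the value a with prob_separated(a, k, t) = p_star."""
    lo, hi = 0.0, 10.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if prob_separated(mid, k, t) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ratio(p_star, k, t):
    """The ratio Phi^{-1}(P*) / A_t appearing in Lemma 4.3."""
    return STD.inv_cdf(p_star) / solve_a(p_star, k, t)


if __name__ == "__main__":
    for p_star in (0.90, 0.99, 0.999):
        print(p_star, round(ratio(p_star, 4, 2), 4))
```

For k = 2 and t = 1 the probability reduces to $\Phi(a)$, so the ratio is exactly 1; for larger k the ratio stays below 1 and moves toward 1 only slowly as $P^*$ increases.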
Counterexample 4.4: Under the same setup as in Lemma 4.2,

$\lim_{n\to\infty} P[CS \mid P_V, \theta_1(t,n)] < P^* = \lim_{n\to\infty} P[CS \mid P_V, \theta_0(t,n)]$.

Proof

Note that $0 \le \rho < 1$, since $\rho = corr(H_0, 2F-1)$ and $H_0$ and
$2F-1$ are monotone increasing in x for fixed $\theta_0$. Choose
$P^*$ and k large enough such that $[A_t/\Phi^{-1}(P^*)]((k+1)/k)^{1/2} < 1/\rho$.
Substituting this into (4.1), the inequality follows. The
equality is due to Theorem 2.7. ■
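
To see how close $\rho$ is to 1 in practice, it can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $\rho$ to be the correlation of $H_0(X)$ and $2F(X) - 1$ for X logistic, as in the proof above, substitutes $u = F(x)$ so that the integrals run over (0, 1), and uses a midpoint rule. The function name `rho` and the grid size are hypothetical choices.

```python
import math


def rho(k, t, theta0, n_grid=20000):
    """corr(H0(X), 2F(X) - 1) for X logistic, by the midpoint rule in u = F(x)."""
    e_pos, e_neg = math.exp(theta0), math.exp(-theta0)
    s_h = s_hh = s_hg = 0.0
    for i in range(n_grid):
        u = (i + 0.5) / n_grid
        f_minus = u / (u + (1.0 - u) * e_pos)  # F(x - theta0) at x = logit(u)
        f_plus = u / (u + (1.0 - u) * e_neg)   # F(x + theta0) at x = logit(u)
        h = (k - t - 1) * f_minus + 2.0 * u + (t - 1) * f_plus  # H0 as in (4.3)
        s_h += h
        s_hh += h * h
        s_hg += h * (2.0 * u - 1.0)
    mean_h = s_h / n_grid
    var_h = s_hh / n_grid - mean_h ** 2
    cov = s_hg / n_grid  # E[2F(X) - 1] = 0, so this mean is the covariance
    return cov / math.sqrt(var_h / 3.0)  # Var[2F(X) - 1] = 1/3


if __name__ == "__main__":
    for theta0 in (0.5, 1.0, 2.0, 4.0):
        print(theta0, round(rho(4, 2, theta0), 5))
```

At $\theta_0 = 0$ the function $H_0$ is proportional to F and the correlation equals 1; for $\theta_0 > 0$ it falls strictly below 1, which is what leaves room for the inequality in Counterexample 4.4.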

Through Counterexamples 4.1 and 4.4, we have seen that
the SPC minimizes $P[CS \mid P_V]$ neither when one has a fixed
sample size nor when one lets $\delta^*$ ($= \theta_{[k-t+1]} - \theta_{[k-t]}$) tend
to zero as $n \to \infty$. Note that the logistic df possesses a monotone
likelihood ratio with respect to its location parameter
and has a support independent of its location parameter; thus
imposing additional conditions such as the above two will not
obviate the difficulty in the LFC.

It is an open question whether (for selection of the t
best by $P_V$)

$\inf_{\Omega(\delta^*, t)} P[CS \mid P_V] = P[CS \mid P_V, \theta_0(t)]$.


5.  Remarks  on  Selection  Procedures  based  on  Ranks. 

In the literature of selection procedures based on ranks
(either joint ranks or vector ranks) each contribution either
imposes artificial restrictions on the parameter space (Puri
and Puri (1968), (1969), Gupta and McDonald (1970), and
McDonald (1972), (1973)), or is not able to find the LFC
(Blumenthal and Patterson (1969)), or was partially invalidated
by Rizvi and Woodworth (1970) (Lehmann (1963), and Bartlett
and Govindarajulu (1968)). A conjecture as to why these procedures
were invalidated is that the LFC's for them were
sought in a parameter space where the P[CS] for certain
parametric procedures is monotone while the P[CS] for rank
procedures is not monotone (as is indicated by Gupta and
McDonald (1970) and Blumenthal and Patterson (1969)).

For any procedure based on joint ranks or vector ranks,
$P_{RANK}$, define

$R_{ID} = \frac{\inf_{\theta \in \Omega(\delta^*, t)} P[CS \mid P_{RANK}] - P[CS \mid P_{RANK}, \theta_0(t)]}{P[CS \mid P_{RANK}, \theta_0(t)]} \times 100$

for the indifference zone approach, and

$R_{SS} = \frac{\inf_{\Omega} P[CS \mid P_{RANK}] - P[CS \mid P_{RANK}, EPC]}{P[CS \mid P_{RANK}, EPC]} \times 100$

for the subset-selection approach. Then the quantities $R_{ID}$
and $R_{SS}$ merit study because small $R_{ID}$ and $R_{SS}$ may well
justify the SPC assumption (which will simplify theoretical
development) while large $R_{ID}$ and $R_{SS}$ imply that the SPC
assumption may be of only theoretical interest. [This aspect
was called to our attention by Dr. Gary C. McDonald.]


We have noted in Section 4 that $P_V$ also suffers from the
LFC difficulty unless the SPC is assumed. Hence we wish to compute
$R_{ID}$ in the case of Counterexample 4.1 (where n = 2, k = 3, and
the LFC is relatively simple). Our results on $R_{ID}$ for
p = .01(.01).99 (and some typical values of $P[CS \mid P_V, SPC]$)
are summarized in Table 5.1. The minimum of $P[CS \mid P_V, SPC]$
is .66146 (occurring at p = .50) and the maximum of $R_{ID}$ is
3.11234% (occurring at p = .77) out of the cases studied.
These computations indicate that the assumption of the SPC
as an underlying configuration may not be unreasonable. We
propose that further study of $R_{ID}$ and $R_{SS}$ be carried out
to see how far this result generalizes to other cases.


Table 5.1  $R_{ID}$

p      $P[CS \mid P_V, SPC]$    $R_{ID}$ (%)

.01    .98991                 .00327
.10    .89555                 .27168
.20    .79887                 .86522
.30    .72492                1.49768
.40    .67910                1.99793
.50    .66146 (minimum)      2.36220
.60    .67014                2.69315
.70    .70420                3.01292
.77    .74403                3.11234 (maximum)
.80    .76559                3.07632
.90    .86051                2.31874
.99    .98363                 .32231
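
The $R_{ID}$ column can be cross-checked against the difference formula given just before Table 4.1. The sketch below is our own check, not the authors' program, and rests on two assumptions: that q = 1 - p, and that the product of the two ratio factors in that formula equals 1/4 at the minimizing configuration. With these assumptions the tabulated percentages are reproduced to the printed accuracy. The names `P_SPC`, `shortfall` and `r_id_percent` are hypothetical, and the values in `P_SPC` are copied from Table 5.1.

```python
# P[CS | P_V, SPC] values copied from Table 5.1, keyed by p.
P_SPC = {
    0.01: 0.98991,
    0.10: 0.89555,
    0.30: 0.72492,
    0.50: 0.66146,
    0.60: 0.67014,
    0.77: 0.74403,
    0.99: 0.98363,
}


def shortfall(p):
    """Drop in P[CS] below its SPC value at the minimizing configuration."""
    q = 1.0 - p  # assumption: q is the complement of p
    x = 0.5      # assumption: each ratio factor equals 1/2 at the minimizer
    bracket = q**4 * p**2 + q**3 * p**3 + q * p**5
    return (1.0 - x) * x * (4.0 / 3.0) * bracket


def r_id_percent(p):
    """Magnitude of the relative gap, in percent, as tabulated in Table 5.1."""
    return 100.0 * shortfall(p) / P_SPC[p]


if __name__ == "__main__":
    for p in sorted(P_SPC):
        print(f"{p:.2f}  {r_id_percent(p):.5f}")
```

The same check shows that the entry at p = .60 must be .67014, since that value is the one consistent with the tabulated 2.69315%.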
6. Discussions

We have studied mathematical properties of the vector rank
procedure  when  applied  to  selecting  the  largest  location  parameter 
in  randomized  block  models.  Even  if  the  data  is  not  quantitative 
but  ordinal,  or  is  not  from  a  location  model  but  from  a  stochas¬ 
tically  ordered  family  of  distributions,  the  vector  rank  procedure 
is  applicable. 
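
Because the procedure uses only within-block orderings, it can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it computes $R_{ij}$ as the number of observations in block j not exceeding $X_{ij}$, sums over blocks to get $H_i$, and selects the t populations with the largest rank sums. Ties in the rank sums are broken by index here purely for simplicity (the procedure studied above uses randomization), and the names `rank_sums` and `select_top` are hypothetical.

```python
def rank_sums(blocks):
    """H_i = sum over blocks j of R_ij, with R_ij = #{i' : X_i'j <= X_ij}."""
    k = len(blocks[0])
    h = [0] * k
    for block in blocks:
        for i, x in enumerate(block):
            h[i] += sum(1 for y in block if y <= x)
    return h


def select_top(blocks, t):
    """Indices (0-based) of the t populations with the largest rank sums."""
    h = rank_sums(blocks)
    # Python's sort is stable, so equal rank sums keep index order
    # (a simplification of the randomized tie-breaking rule)
    order = sorted(range(len(h)), key=lambda i: h[i], reverse=True)
    return sorted(order[:t])


if __name__ == "__main__":
    # three blocks of k = 3 ordinal scores; only the orderings matter
    data = [[1, 3, 2], [2, 3, 1], [1, 2, 3]]
    print(rank_sums(data))
    print(select_top(data, 1))
```

Only comparisons between observations in the same block are used, which is why the same code applies to ordinal scores as well as to quantitative measurements.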

An  important  competing  selection  procedure  is  based  on  the 
robust  estimate  of  location  parameters  (Sen  and  Puri  (1972)).  The 
selection  procedure  based  on  robust  location  estimates  does  not 
have  the  LFC  difficulty  that  the  vector  rank  procedure  suffers 
from,  and  its  relative  efficiency  compared  to  the  means  procedure 
is that of the Mann-Whitney-Wilcoxon test versus the t-test. A
serious disadvantage, however, is that the robust location estimate
method is not applicable if the data is ordinal or from a non-location
family: for example, see Lee and Dudewicz (1980) where
the  data  is  incomplete  rank  order  scores  or  Lee  (1980)  where  the 
distributional  origin  of  the  data  is  not  known. 

We  now  discuss  how  to  choose  a  proper  selection  procedure  to 
be  applied.  If  the  data  is  from  a  location  family,  then  the  robust 
procedure  of  choice  should  be  based  on  robust  location  estimates. 

If  the  data  originates  from  a  location  family  but  in  the  ordinal 
form,  or  from  a  stochastically  increasing  scale  parameter  family,  then  the 
vector  rank  procedure  may  be  applied  to  selecting  the  population 
with the largest parameter of interest. In this latter case, it
is possible that the $P(CS) \ge P^*$ requirement is not met. In doubtful
cases,  the  multinomial  category  selection  procedure  (Lee,  1980)  is 
a  possible  alternative. 


As  a  final  remark,  note  that  the  procedure  considered  here 
is,  like  most  robust  selection  procedures,  not  nonparametric  since 
the  required  sample  size  (say  (2.12)  or  (2.17))  depends  on  the 
underlying distribution, but is less sensitive to departure from
the assumed underlying distribution than the means procedure.